TITLE OF THE INVENTION

CODED DOMAIN ECHO CONTROL 

CROSS-REFERENCE TO RELATED APPLICATIONS 

This is a utility application corresponding to provisional application no. 60/142,136 entitled
"CODED DOMAIN ENHANCEMENT OF COMPRESSED SPEECH", filed July 2, 1999.

STATEMENT REGARDING FEDERALLY 
SPONSORED RESEARCH OR DEVELOPMENT 

Not Applicable. 

BACKGROUND OF THE INVENTION 
The present invention relates to coded domain enhancement of compressed speech and in
particular to coded domain echo control.

This specification will refer to the following references: 

[1] GSM 06.10, "Digital cellular telecommunication system (Phase 2); Full rate speech; Part 2:
Transcoding", ETS 300 580-2, March 1998, Second Edition.

[2] GSM 06.60, "Digital cellular telecommunications system (Phase 2); Enhanced Full Rate (EFR)
speech transcoding", June 1998.

[3] GSM 08.62, "Digital cellular telecommunications system (Phase 2+); Inband Tandem Free
Operation (TFO) of Speech Codecs", ETSI, March 2000.

[4] J. R. Deller, J. G. Proakis, J. H. L. Hansen, "Discrete-Time Processing of Speech Signals",
Chapter 7, Prentice-Hall Inc., 1987.

[5] GSM 06.12, "European digital cellular telecommunications system (Phase 2); Comfort noise
aspect for full rate speech traffic channels", ETSI, September 1994.

In the GSM digital cellular network, speech transmission between the mobile
stations (handsets) and the base station is in compressed or coded form. Speech
coding techniques such as the GSM FR [1] and EFR [2] are used to compress the
speech. The devices used to compress speech are called vocoders. The coded speech
requires less than 2 bits per sample. This situation is depicted in Figure 1. Between
the base stations, the speech is transmitted in an uncoded form (using PCM
companding, which requires 8 bits per sample).

The terms coded speech and uncoded speech may be described as follows:

Uncoded speech: refers to the digital speech signal samples typically used in
telephony; these samples are either in linear 13-bits per sample form or companded
form such as the 8-bits per sample μ-law or A-law PCM form; the typical bit-rate is
64 kbps.

Coded speech: refers to the compressed speech signal parameters (also
referred to as coded parameters) which use a bit rate typically well below 64 kbps such
as 13 kbps in the case of the GSM FR and 12.2 kbps in the case of GSM EFR; the
compression methods are more extensive than the simple PCM companding scheme;
examples of compression methods are linear predictive coding, code-excited linear
prediction and multi-band excitation coding [4].

The Tandem-Free Operation (TFO) standard [3] will be deployed in GSM
digital cellular networks in the near future. The TFO standard applies to
mobile-to-mobile calls. Under TFO, the speech signal is conveyed between mobiles in a
compressed form after a brief negotiation period. This eliminates tandem voice codecs
during mobile-to-mobile calls. The elimination of tandem codecs is known to improve
speech quality in the case where the original signal is clean. The key point to note is
that the speech transmission remains coded between the mobile handsets, as
depicted in Figure 2.

Under TFO, the transmissions between the handsets and base stations are
coded, requiring less than 2 bits per speech sample. However, 8 bits per speech
sample are still available for transmission between the base stations. At the base
station, the speech is decoded and then A-law companded so that 8 bits per sample are
necessary. However, the original coded speech bits are used to replace the 2 least
significant bits (LSBs) in each 8-bit A-law companded sample. Once TFO is
established between the handsets, the base stations only send the 2 LSBs in each 8-bit
sample to their respective handsets and discard the 6 MSBs. Hence vocoder
tandeming is avoided. The process is illustrated in Figure 3.
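
The bit substitution described above can be illustrated with a short Python sketch. It is a simplified illustration, not an implementation of the TFO standard [3]: the function names, the one-list-entry-per-bit representation, the requirement of exactly two coded bits per sample, and the order of the two coded bits within each sample are all assumptions made for the example. TFO framing and synchronization bits are ignored.

```python
def embed_coded_bits(alaw_samples, coded_bits):
    """Overwrite the 2 LSBs of each 8-bit A-law sample with 2 coded bits."""
    if len(coded_bits) != 2 * len(alaw_samples):
        raise ValueError("expected exactly 2 coded bits per sample")
    octets = []
    for i, sample in enumerate(alaw_samples):
        hi_bit, lo_bit = coded_bits[2 * i], coded_bits[2 * i + 1]
        # Keep the 6 MSBs of the PCM code, replace the 2 LSBs (bit order assumed).
        octets.append((sample & 0xFC) | (hi_bit << 1) | lo_bit)
    return octets


def extract_coded_bits(octets):
    """Recover the coded bits from the 2 LSBs, discarding the 6 MSBs."""
    bits = []
    for octet in octets:
        bits.extend([(octet >> 1) & 1, octet & 1])
    return bits


def pcm_msbs(octet):
    """Return the 6 MSBs, which still belong to the A-law PCM code."""
    return octet >> 2
```

In this sketch a base station that has established TFO would forward only the output of `extract_coded_bits` to the handset, while a device in the network can still inspect the 6 MSBs returned by `pcm_msbs`.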

The echo problem and its traditional solution are shown in Figure 4. In
wireline networks, echo occurs due to the impedance mismatch at the 4-wire-to-2-wire
hybrids. The mismatch results in electrical reflections of a portion of the far-end
signal into the near-end signal. Depending on the channel impulse response of the
endpath and network delay, the echo can be annoying to the far end listener. The
endpath impulse response is estimated using a network echo canceller (EC) and is
used to produce an estimate of the echo signal. The estimate is then subtracted from
the near-end signal to remove the echo. After EC processing, any residual echo is
removed by the non-linear processor (NLP).
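
As an illustration of the traditional approach, the following Python sketch estimates the endpath impulse response adaptively and subtracts the resulting echo estimate from the near-end signal. The text above does not specify the adaptation algorithm; the normalized LMS (NLMS) update used here is one common choice and is an assumption of this example, as are the function name, the tap count and the step size. The non-linear processor stage is not modeled.

```python
def nlms_echo_cancel(far, near, taps=8, mu=0.5, eps=1e-8):
    """Return (residual, weights) for uncoded far-end and near-end samples.

    far, near: equal-length sequences of linear (uncoded) samples.
    taps:      assumed length of the endpath impulse response estimate.
    mu, eps:   NLMS step size and regularization constant (assumed values).
    """
    weights = [0.0] * taps          # current estimate of the endpath response
    history = [0.0] * taps          # most recent far-end samples, newest first
    residual = []
    for f, d in zip(far, near):
        history = [f] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        e = d - echo_estimate       # near-end signal with the echo estimate removed
        norm = sum(x * x for x in history) + eps
        weights = [w + mu * e * x / norm for w, x in zip(weights, history)]
        residual.append(e)
    return residual, weights
```

Note that this kind of canceller operates sample by sample on the uncoded signal, which is why the traditional approach cannot be applied directly to coded speech.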

In the case of a digital cellular handset, the echo occurs due to the feedback
from the speaker (earpiece) to the microphone (mouthpiece). The acoustic feedback
can be significant and the echo can be annoying, particularly in the case of hands-free
phones.

Figure 5 shows the feedback path from the speaker to the microphone in a
digital cellular handset. The depicted handset does not have echo cancellation
implemented in the handset.

Under TFO in GSM networks, if echo cancellation is implemented in the
network, a traditional approach requires decoding the coded speech, processing the
resulting uncoded speech and then re-encoding it. Such decoding and re-encoding is
necessary because traditional echo cancellers can only operate on the uncoded speech
signal. This approach is shown in Figure 6. Some of the disadvantages of this
approach are as follows.

1. This approach is computationally expensive due to the need for two
decoders and an encoder. Typically, encoders are at least an order of
magnitude more complex computationally than decoders. Thus, the
presence of an encoder, in particular, is a major computational burden.

2. The delay introduced by the decoding and re-encoding processes is
undesirable.

3. A vocoder tandem (i.e. two encoder/decoder pairs placed in series) is
introduced in this approach, which is known to degrade speech quality due
to quantization effects.

In another straightforward approach, comfort noise generation may be used to
mask the echo. Comfort noise generation is used for silence suppression or
discontinuous transmission purposes (e.g. [5]). It is possible to use such techniques to
completely mask the echo whenever echo is detected. However, such techniques
suffer from "choppiness", particularly during double-talk conditions, as well as poor
and unnatural background transparency.

The proposed techniques are capable of performing echo control (acoustic or
linear) directly on the coded speech (i.e. by direct modification of the coded
parameters). Low computational complexity and delay are achieved. Tandeming
effects are avoided or minimized, resulting in better perceived quality after echo
control. Excellent background transparency is also achieved.

Speech compression, which falls under the category of lossy source coding, is
commonly referred to as speech coding. Speech coding is performed to minimize the
bandwidth necessary for speech transmission. This is especially important in wireless
telephony where bandwidth is scarce. In the relatively bandwidth-abundant packet
networks, speech coding is still important to minimize network delay and jitter. This
is because speech communication, unlike data, is highly intolerant of delay. Hence a
smaller packet size eases the transmission through a packet network. The four ETSI
GSM standards of concern are listed in Table 1.

Table 1: GSM Speech Codecs

Codec Name                   Coding Method    Bit Rate (kbits/sec)
Half Rate (HR)               VSELP            5.6
Full Rate (FR)               RPE-LTP          13
Enhanced Full Rate (EFR)     ACELP            12.2
Adaptive Multi-Rate (AMR)    MR-ACELP         5.4-12.2

In speech coding, a set of consecutive digital speech samples is referred to as a
speech frame. The GSM coders operate on a frame size of 20 ms (160 samples at 8 kHz
sampling rate). Given a speech frame, a speech encoder determines a small set of
parameters for a speech synthesis model. With these speech parameters and the
speech synthesis model, a speech frame can be reconstructed that appears and sounds
very similar to the original speech frame. The reconstruction is performed by the
speech decoder. In the GSM vocoders listed above, the encoding process is much
more computationally intensive than the decoding process.

The speech parameters determined by the speech encoder depend on the
speech synthesis model used. The GSM coders in Table 1 utilize linear predictive
coding (LPC) models. A block diagram of a simplified view of a generic LPC speech
synthesis model is shown in Figure 7. This model can be used to generate speech-like
signals by specifying the model parameters appropriately. In this example speech
synthesis model, the parameters include the time-varying filter coefficients, pitch
periods, codebook vectors and the gain factors. The synthetic speech is generated as
follows. An appropriate codebook vector, c(n), is first scaled by the codebook gain
factor G. Here n denotes sample time. The scaled codebook vector is then filtered by
a pitch synthesis filter whose parameters include the pitch gain, g_p, and the pitch
period, T. The result is sometimes referred to as the total excitation vector, u(n). As
implied by its name, the pitch synthesis filter provides the harmonic quality of voiced
speech. The total excitation vector is then filtered by the LPC synthesis filter which
specifies the broad spectral shape of the speech frame and the broad spectral shape of
the corresponding audio signal.

For each speech frame, the parameters are usually updated more than once.
For instance, in the GSM FR and EFR coders, the codebook vector, codebook gain
and the pitch synthesis filter parameters are determined every subframe (5 ms). The
LPC synthesis filter parameters are determined twice per frame (every 10 ms) in EFR
and once per frame in FR.
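
The generic synthesis model of Figure 7 can be illustrated with the following Python sketch for one subframe. It is a minimal sketch under stated assumptions rather than the decoder of any GSM standard: the function name, the filter memory arguments and the sign conventions u(n) = G c(n) + g_p u(n - T) and s(n) = u(n) + sum over k of a_k s(n - k) are assumptions chosen for the example, and no quantization or post-filtering is modeled.

```python
def synthesize_subframe(codebook, G, g_p, T, lpc, exc_mem=None, out_mem=None):
    """Generate one subframe of synthetic speech from generic LPC parameters.

    codebook: codebook vector c(n) for the subframe
    G, g_p, T: codebook gain, pitch gain and pitch period (in samples, T >= 1)
    lpc:      LPC coefficients a_1 .. a_M (sign convention assumed, see lead-in)
    exc_mem:  past excitation samples, oldest first (at least T of them)
    out_mem:  past output samples, oldest first (at least M of them)
    """
    M = len(lpc)
    u = list(exc_mem) if exc_mem is not None else [0.0] * T
    s = list(out_mem) if out_mem is not None else [0.0] * M
    if len(u) < T or len(s) < M:
        raise ValueError("filter memories are too short")
    speech = []
    for c in codebook:
        # Pitch synthesis filter: adds the harmonic structure of voiced speech.
        u_n = G * c + g_p * u[-T]
        u.append(u_n)
        # LPC synthesis filter: imposes the broad spectral shape.
        s_n = u_n + sum(a * s[-k] for k, a in enumerate(lpc, start=1))
        s.append(s_n)
        speech.append(s_n)
    return speech, u, s
```

The returned memories `u` and `s` would be passed back in for the next subframe, since the parameters change every subframe while the filter states carry over.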

A typical sequence of steps used in a speech encoder is as follows: 

1. Obtain a frame of speech samples.

2. Multiply the frame of samples by a window (e.g. Hamming window) and
determine the autocorrelation function up to lag M.

3. Determine the reflection coefficients and/or LPC coefficients from the
autocorrelation function. (Note that reflection coefficients are an alternative
representation of the LPC filter coefficients.)

4. Transform the reflection coefficients or LPC filter coefficients to a different
form suitable for quantization (e.g. log-area ratios or line spectral frequencies).

5. Quantize the transformed LPC coefficients using vector quantization
techniques.

6. Add any additional error correction/detection, framing bits etc.

7. Transmit the coded parameters.
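
Steps 2 and 3 of the sequence above can be sketched in Python as follows. The text does not name the method used to obtain the coefficients from the autocorrelation function; the Levinson-Durbin recursion used here is the customary choice and is an assumption of this example, as are the function names and the sign convention (the predictor is s(n) ≈ sum over k of a_k s(n - k), and the reflection coefficient sign follows from that convention).

```python
import math


def windowed_autocorrelation(frame, M):
    """Apply a Hamming window and return autocorrelation lags 0..M."""
    N = len(frame)
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]
    x = [s * w for s, w in zip(frame, window)]
    return [sum(x[n] * x[n - lag] for n in range(lag, N)) for lag in range(M + 1)]


def levinson_durbin(r, M):
    """Return (lpc, reflection) computed from autocorrelation lags r[0..M]."""
    a = [0.0] * (M + 1)             # a[1..i] hold the order-i predictor
    reflection = []
    err = r[0]                      # prediction error energy
    for i in range(1, M + 1):
        if err <= 0.0:
            break                   # degenerate (e.g. all-zero) frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
        reflection.append(k)
    return a[1:], reflection
```

For a 160-sample GSM frame, `windowed_autocorrelation(frame, M)` followed by `levinson_durbin(r, M)` would yield both representations mentioned in step 3; the transformation and quantization of steps 4 and 5 are codec-specific and are not shown.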



The following sequence of operations is typically performed for each
subframe by the speech encoder:

1. Determine the pitch period.

2. Determine the corresponding pitch gain.

3. Quantize the pitch period and pitch gain.

4. Inverse filter the original speech signal through the quantized LPC synthesis
filter to obtain the LPC residual signal.

5. Inverse filter the LPC residual signal through the pitch synthesis filter to
obtain the pitch residual.

6. Determine the best codebook vector.

7. Determine the best codebook gain.

8. Quantize the codebook gain and codebook vector.

9. Update the filter memories appropriately.

A typical sequence of steps used in a speech decoder is as follows:
First, perform any error correction/detection and framing.
Then, for each subframe:

1. Dequantize all the received coded parameters (LPC coefficients, pitch period,
pitch gain, codebook vector, codebook gain).

2. Scale the codebook vector by the codebook gain and filter it using the pitch
synthesis filter to obtain the LPC excitation signal.

3. Filter the LPC excitation signal using the LPC synthesis filter to obtain a
preliminary speech signal.

4. Construct a post-filter (usually based on the LPC coefficients).

5. Filter the preliminary speech signal to reduce quantization noise to obtain the
final synthesized speech.

As an example of the arrangement of coded parameters in the bit-stream
transmitted by the encoder, the GSM FR vocoder is considered. For the GSM FR
vocoder, a frame is defined as 160 samples of speech sampled at 8 kHz, i.e. a frame is
20 ms long. With A-law PCM companding, 160 samples would require 1280 bits for
transmission. The encoder compresses the 160 samples into 260 bits. The
arrangement of the various coded parameters in the 260 bits of each frame is shown in
Figure 8. The first 36 bits of each coded frame consist of the log-area ratios which
correspond to the LPC synthesis filter. The remaining 224 bits can be grouped into 4
subframes of 56 bits each. Within each subframe, the coded parameter bits contain the
pitch synthesis filter related parameters followed by the codebook vector and gain
related parameters.
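
The frame arithmetic above can be checked with a few lines of Python. All numbers are taken from the preceding paragraph; only the constant names and the helper `subframe_bit_range`, which assumes the subframes are stored back to back after the log-area ratios with bit positions counted from zero, are introduced for this example.

```python
SAMPLES_PER_FRAME = 160
SAMPLE_RATE_HZ = 8000
LAR_BITS = 36            # log-area ratios (LPC synthesis filter)
SUBFRAMES = 4
SUBFRAME_BITS = 56

frame_bits = LAR_BITS + SUBFRAMES * SUBFRAME_BITS        # 36 + 224 bits
frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ     # frame duration
bit_rate_kbps = frame_bits / frame_ms                    # bits per millisecond
bits_per_sample = frame_bits / SAMPLES_PER_FRAME
pcm_bits = 8 * SAMPLES_PER_FRAME                         # A-law PCM equivalent


def subframe_bit_range(j):
    """Return (start, end) bit positions of subframe j (0-based, end exclusive)."""
    if not 0 <= j < SUBFRAMES:
        raise ValueError("subframe index out of range")
    start = LAR_BITS + j * SUBFRAME_BITS
    return start, start + SUBFRAME_BITS
```

The resulting figure of 1.625 bits per sample is consistent with the earlier statement that coded speech requires less than 2 bits per sample.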

BRIEF SUMMARY OF THE INVENTION 
The preferred embodiment is useful in a communications system for
transmitting a near end digital signal using a compression code comprising a plurality
of parameters including a first parameter. The parameters represent an audio signal
comprising a plurality of audio characteristics. The compression code is decodable by
a plurality of decoding steps. The communications system also transmits a far end
digital signal using a compression code. In such an environment, the echo in the near
end digital signal can be reduced by reading at least the first parameter of the
plurality of parameters in response to the near end digital signal. At least one of the
plurality of the decoding steps is performed on the near end digital signal and the far
end digital signal to generate at least partially decoded near end signals and at least
partially decoded far end signals. The first parameter is adjusted in response to the at
least partially decoded near end signals and at least partially decoded far end signals
to generate an adjusted first parameter. The first parameter is replaced with the
adjusted first parameter in the near end digital signal. The reading, generating and
adjusting preferably are performed by a processor.

Another embodiment of the invention is useful in a communications system
for transmitting a near end digital signal comprising code samples further comprising
first bits using a compression code and second bits using a linear code. The code
samples represent an audio signal having a plurality of audio characteristics. The
system also transmits a far end digital signal. In such an environment, any echo in the
near end digital signal can be reduced without decoding the compression code by
adjusting the first bits and second bits in response to the near end digital signal and
the far end digital signal.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic block diagram of a system for speech transmission in a
GSM digital cellular network.

Figure 2 is a schematic block diagram of a system for speech transmission in a
GSM network under tandem-free operation (TFO).

Figure 3 is a graph illustrating transmission of speech under tandem-free
operation (TFO).

Figure 4 is a schematic block diagram of a traditional solution to an echo
problem in a wireline network.

Figure 5 is a schematic block diagram illustrating acoustic feedback from a
speaker to a microphone in a digital cellular telephone.

Figure 6 is a schematic block diagram of a traditional echo cancellation
approach for coded speech.

Figure 7 is a schematic block diagram of a generic linear predictive code
(LPC) speech synthesis model or speech decoder model.

Figure 8 is a diagram illustrating the arrangement of coded parameters in the
bit stream for GSM FR.

Figure 9 is a schematic block diagram of a preferred form of coded domain
echo control system for acoustic echo environments made in accordance with the
invention.

Figure 10 is a schematic block diagram of another preferred form of coded
domain echo control system for echo due to 4-wire-to-2-wire hybrids made in
accordance with the invention.

Figure 11 is a schematic block diagram of a simplified end path model with
flat delay and attenuation.

Figure 12 is a graph illustrating a preliminary echo likelihood versus near end
to far end subframe power ratio.

Figure 13 is a flow diagram illustrating a preferred form of coded domain echo
control methodology.

Figure 14 is a graph illustrating an exemplary pitch synthesis filter magnitude
frequency response.

Figure 15 is a graph illustrating exemplary magnitude frequency responses of
an original LPC synthesis filter and flattened versions of such a filter.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
The preferred embodiments will be described with reference to the following
abbreviations:



ACELP       Algebraic Code Excited Linear Prediction
AE          Audio Enhancer
ALC         Adaptive or Automatic Level Control
CD          Coded Domain or Compressed Domain
CDEC        Coded Domain Echo Control
EFR         Enhanced Full Rate
ETSI        European Telecommunications Standards Institute
FR          Full Rate
GSM         Global System for Mobile Communications
ITU         International Telecommunication Union
MR-ACELP    Multi-Rate ACELP
PCM         Pulse Code Modulation (ITU G.711)
RPE-LTP     Regular Pulse Excitation - Long Term Prediction
TFO         Tandem Free Operation
VSELP       Vector Sum Excited Linear Prediction



Speech Synthesis Transfer Function 

Although many non-linearities and heuristics are involved in the speech
synthesis at the decoder, the following approximate transfer function may be
attributed to the synthesis process:

$$H(z) = \frac{G}{\left(1 - g_p z^{-T}\right)\left(1 - \sum_{k=1}^{M} a_k z^{-k}\right)} \qquad (1)$$

The codebook vector, c(n), is filtered by H(z) to result in the synthesized
speech. The key point to note about this generic LPC speech synthesis or decoder
model for speech decoding is that the available coded parameters that can be modified
to achieve echo control are:

1. c(n): codebook vector

2. G: codebook gain

3. g_p: pitch gain

4. T: pitch period

5. {a_k, k = 1, ..., M}: LPC coefficients

Most LPC-based vocoders use parameters similar to the above set, parameters
that may be converted to the above forms, or parameters that are related to the above
forms. For instance, the LPC coefficients in LPC-based vocoders may be represented
using log-area ratios (e.g. the GSM FR) or line spectral frequencies (e.g. GSM EFR);
both of these forms can be converted to LPC coefficients. An example of a case where
a parameter is related to the above form is the block maximum parameter in the GSM
FR vocoder; the block maximum can be considered to be directly proportional to the
codebook gain in the model described by equation (1).

Thus, although the discussion of coded parameter modification methods is
mostly limited to the generic speech decoder model, it is relatively straightforward to
tailor these methods for any LPC-based vocoder, and possibly even other models.

It should also be clear that non-linear processing methods such as
center-clipping used with uncoded speech for echo control cannot be used on the coded
parameters because the coded parameter representation of the speech signal is
significantly different. Even the codebook vector signal, c(n), is not amenable to
center-clipping due to the significant quantization involved. In many vocoders, the
majority of the codebook vector samples are already zero while the non-zero pulses
are highly quantized. Hence such non-linear processing approaches are not applicable
or effective.

In this specification and claims, the terms linear code and compression code
have the following meanings:

Linear code: By a linear code, we mean a compression technique that results in one
coded parameter or coded sample for each sample of the audio signal. Examples of
linear codes are PCM (A-law and μ-law), ADPCM (adaptive differential pulse code
modulation), and delta modulation.

Compression code: By a compression code, we mean a technique that results in
fewer than one coded parameter for each sample of the audio signal. Typically,
compression codes result in a small set of coded parameters for each block or frame
of audio signal samples. Examples of compression codes are linear predictive coding
based vocoders such as the GSM vocoders (HR, FR, EFR).

Coded Domain Echo Control 



Overview 



Figure 9 shows a novel implementation of coded domain echo control (CDEC)
for a situation where acoustic echo is present. A communications system 10 transmits
near end coded digital signals over a network 24 using a compression code, such as
any of the codes used by the codecs identified in Table 1. The compression code is
generated by an encoder 16 from linear audio signals generated by a near end
microphone 14 within a near end speaker handset 12. The compression code
comprises parameters, such as those shown in Figure 8. The parameters represent an
audio signal comprising a plurality of audio characteristics, including audio level and
power. The compression code is decodable by various decoding steps. As will be
explained, system 10 controls echo in the near end digital signals due to the presence
of far end digital signals transmitted by system 10 over a network 32. The echo is
controlled with minimal delay and minimal, if any, decoding of the compression code
parameters shown in Figure 8.



Near end digital signals using the compression code are received on a near end
terminal 20, and digital signals using an adjusted compression code are transmitted by
a near end terminal 22 over a network 24 to a far end handset (not shown) which
includes a decoder (not shown) of the adjusted compression code. Note that the
adjusted compression code is compatible with the original compression code. In other
words, when the coded parameters are modified or adjusted, we term it the adjusted
compression code, but it still is decodable using a standard decoder corresponding to
the original compression code. A linear far end audio signal is encoded by a far end
encoder (not shown) to generate far end digital signals using a compression code
compatible with decoder 18, and is transmitted over a network 32 to a far end terminal
34. A decoder 18 of near end handset 12 decodes the far end digital signals. As
shown in Figure 9, echo signals from the far end signals may find their way to
encoder 16 of the near end handset 12 through acoustic feedback.

A processor 40 performs various operations on the near end and far end
compression code. Processor 40 may be a microprocessor, microcontroller, digital
signal processor, or other type of logic unit capable of arithmetic and logical
operations.

For each type of codec, a different coded domain echo control algorithm 44 is
executed by processor 40 at all times, under compressed mode and linear mode,
during TFO as well as non-TFO. A partial decoder 48 is executed by processor 40 to
read at least a first of the parameters received at terminal 20. Another partial decoder
46 is executed by processor 40 to generate at least partially decoded far end signals.
Decoder 48 generates at least partially decoded near end signals. (Note that the
compression codes used by the near end and far end signals may be different, and
hence the partial decoders may also be different.) Based on the partial decoding,
algorithm 44 generates an echo likelihood signal at least estimating the amount of
echo in the near end digital signal. The echo likelihood signal varies over time since
the amount of echo depends on the far end speech signal. The echo likelihood signal
is used by algorithm 44 to adjust the parameter(s) read by algorithm 44. The adjusted
parameter is written into the near end digital signal to form an adjusted near end
digital signal which is transmitted from terminal 22 to network 24. In other words,
the adjusted parameter is substituted for the originally read parameter. The partial
decoders 46 and 48 shown within the Network ALC Device are algorithms executed
by processor 40 and are codec-dependent.

The partial decoders operate on signals compressed using compression codes.
In the case where processor 40 is implemented in a TFO environment, partial decoder
46 may decode the linear code rather than the compression code. Also, in this case,
partial decoder 48 decodes the linear code and only determines the coded parameters
from the compression code without actually synthesizing the audio signal from the
compression code.

Blocks 44, 46 and 48 also may be implemented as hardwired circuits. 

Figure 10 shows that the Figure 9 embodiment can be useful for a system in 
which the echo is due to a 4-wire-to-2-wire hybrid. 

The CDEC device/algorithm removes the effects of echo from the near-end
coded speech by directly modifying the coded parameters in the bit-stream received
from the near-end. Decoding of the near-end and far-end signals is performed in order
to determine the likelihood of echo being present in the near-end. Certain statistics are
measured from the decoded signals to determine this likelihood value.

Partial Decoding 

The decoding of near-end and far-end signals may be complete or
partial depending on the vocoder being used for the encode and decode operations.
Some examples of situations where partial decoding suffices are listed below:

1. In code-excited linear prediction (CELP) vocoders, a post-filtering
process is performed on the signal decoded using the LPC-based
model. This post-filtering process reduces quantization noise.
However, since it does not significantly affect the measurement of the
statistics necessary for determining the likelihood of echo, the
post-filtering stage can be avoided for economy.

2. Under TFO in GSM networks, the CDEC device may be placed
between the base station and the switch (known as the A-interface) or
between the two switches. Since the 6 MSBs of each 8-bit sample of
the speech signal correspond to the PCM code as shown in Figure 3, it
is possible to avoid decoding the coded speech altogether in this
situation. A simple table-lookup is sufficient to convert the 8-bit
companded samples to 13-bit linear speech samples using A-law
companding tables. This provides an economical way to obtain a
version of the speech signal without invoking the appropriate decoder.
Note that the speech signal obtained in this manner is somewhat noisy,
but has been found to be adequate for the measurement of the statistics
necessary for determining the likelihood of echo.
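
The table lookup mentioned in item 2 can be sketched in Python as follows. The expansion follows the usual G.711 A-law decoding procedure (even-bit inversion, 3-bit segment, 4-bit mantissa); the function and table names are assumptions made for this example, and the result is scaled to the 13-bit linear range referred to above. Because the 2 LSBs of each octet carry coded speech bits under TFO, the looked-up sample differs slightly from the true PCM sample, which is the source of the noise noted in the text.

```python
def alaw_to_linear(octet):
    """Expand one 8-bit A-law code to a 13-bit signed linear sample."""
    code = octet ^ 0x55                      # undo the A-law even-bit inversion
    mantissa = (code & 0x0F) << 4
    segment = (code & 0x70) >> 4
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    magnitude >>= 3                          # 16-bit scale down to 13-bit scale
    return magnitude if code & 0x80 else -magnitude


# Precomputed once; every later conversion is a single table lookup.
ALAW_TABLE = [alaw_to_linear(octet) for octet in range(256)]


def tfo_octets_to_linear(octets):
    """Convert 8-bit samples (2 LSBs carrying coded bits) to linear samples."""
    return [ALAW_TABLE[octet] for octet in octets]
```

The linear samples returned by `tfo_octets_to_linear` are only used to measure statistics such as subframe power; they are never re-encoded or transmitted.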



Determination Of Echo Likelihood 

Assuming that some uncoded version (either fully or partially decoded) of the
far-end and near-end signals is available, certain statistics are measured and used to
determine the likelihood of echo being present in the near-end signal. The echo
likelihood is estimated for each speech subframe, where the subframe duration is
dependent on the vocoder being used. A preferred approach is described in this
section.

A simplified model of the end-path is assumed as shown in Figure 11. The
end-path is assumed to consist of a flat delay of τ samples and an echo return loss
(ERL), λ.

In this model, s_NE(n) and s_FE(n) are the near-end and far-end uncoded
signals, respectively. It is assumed that the range of τ is known for a given
implementation of CDEC, and is specified as follows:

$$\tau_L \le \tau \le \tau_U \qquad (2)$$

This assumption is reasonable since the maximum and minimum end-path
delays depend mostly on the speech encoding, speech decoding, channel encoding,
channel decoding and other known transmission delays. The ERL range is assumed to
be:

$$0 < \lambda < 1 \qquad (3)$$

The echo likelihood estimation process uses the following variables:

P_NE is the power of the current subframe of the near-end signal.

P_FE(0) is the power of the current subframe of the far-end signal.

P_FE(m) is the power of the m-th subframe before the current subframe of the
far-end signal. In other words, a buffer of past values of far-end subframe power
values is maintained. The buffer size is B_U = ⌈τ_U / N⌉ so that the subframe power
of the far-end signal up to the maximum possible end-path delay is available. Here
N is the number of samples in a subframe.

R is the near-end to far-end subframe power ratio.

p_1 is the preliminary echo likelihood.

p is the echo likelihood obtained by smoothing the preliminary echo
likelihood.

The echo likelihood is estimated for each subframe using the steps below. For
some vocoders, particularly lower bit rate vocoders such as GSM HR, the processing
may be more appropriately performed frame-by-frame rather than
subframe-by-subframe.

Determine the power of s_NE(n) for the current subframe as

$$P_{NE} = \frac{1}{N} \sum_{n=0}^{N-1} s_{NE}^2(n)$$

Determine the power of s_FE(n) for the current subframe as

$$P_{FE}(0) = \frac{1}{N} \sum_{n=0}^{N-1} s_{FE}^2(n)$$

Determine the near-end to far-end power ratio as

$$R = \frac{P_{NE}}{\max\{P_{FE}(B_L), P_{FE}(B_L + 1), \ldots, P_{FE}(B_U)\}}$$

where B_L = ⌊τ_L / N⌋. The denominator is essentially the maximum far-end subframe
power measured during the expected end-path delay time period.

Shift the far-end power values in the buffer, i.e.

$$P_{FE}(B_U) = P_{FE}(B_U - 1); \; \ldots; \; P_{FE}(1) = P_{FE}(0)$$

Determine the preliminary echo likelihood as

$$p_1 = \begin{cases} 0, & \text{for } R > 63 \\ -0.016R + 1.008, & \text{for } 0.5 \le R \le 63 \\ 1, & \text{for } R < 0.5 \end{cases}$$

Smooth the preliminary echo likelihood to obtain the echo likelihood using

$$p = 0.9p + 0.1p_1$$

The graph for the preliminary likelihood as a function of near-end to far-end
subframe power ratio is shown in Figure 12.
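
The estimation steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the class and method names are invented for the example, the subframe power is taken as the mean of the squared samples, the buffer index limits use floor and ceiling of the delay bounds divided by the subframe length, and a zero far-end power in the search window is treated as "no echo possible" to avoid division by zero.

```python
class EchoLikelihoodEstimator:
    """Per-subframe echo likelihood from uncoded near-end and far-end samples."""

    def __init__(self, tau_min, tau_max, subframe_len):
        self.b_lo = tau_min // subframe_len                 # floor(tau_L / N)
        self.b_hi = -(-tau_max // subframe_len)             # ceil(tau_U / N)
        self.far_power = [0.0] * (self.b_hi + 1)            # P_FE(0) .. P_FE(B_U)
        self.p = 0.0                                        # smoothed likelihood

    @staticmethod
    def power(subframe):
        return sum(x * x for x in subframe) / len(subframe)

    @staticmethod
    def preliminary(ratio):
        """Piecewise-linear mapping from power ratio R to preliminary likelihood."""
        if ratio > 63.0:
            return 0.0
        if ratio < 0.5:
            return 1.0
        return -0.016 * ratio + 1.008

    def update(self, near_subframe, far_subframe):
        p_ne = self.power(near_subframe)
        self.far_power[0] = self.power(far_subframe)
        # Maximum far-end power over the expected end-path delay range.
        denom = max(self.far_power[self.b_lo:self.b_hi + 1])
        ratio = p_ne / denom if denom > 0.0 else float("inf")  # assumed guard
        # Shift the buffer: P_FE(m) <- P_FE(m - 1); slot 0 is refilled next call.
        self.far_power = [0.0] + self.far_power[:-1]
        self.p = 0.9 * self.p + 0.1 * self.preliminary(ratio)
        return self.p
```

Because of the 0.9/0.1 smoothing, a single subframe with a low power ratio raises the likelihood only to 0.1; the likelihood approaches 1 only when the near-end signal stays weak relative to the delayed far-end signal over many consecutive subframes.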

Coded Parameter Modification

In this section, the preferred techniques for direct modification of the coded
parameters based on the echo likelihood are described. The direct modification of
each coded parameter of the generic speech decoder model of Figure 7 is first
described. Then the corresponding method for modification of the parameters for a
standard-based vocoder is described. As an example of a standard-based vocoder, the
GSM FR vocoder is considered. After each parameter is modified and quantized
according to the standard, the appropriate parameters in the bit-stream are modified
accordingly. The preferred embodiment of the overall process is depicted in Figure
13.

Codebook Gain Modification 

The codebook gain parameter, $G$, for each subframe is reduced by a scale
factor depending on the echo likelihood, $p$, for the subframe. The modified codebook
gain parameter, denoted by $G_{new}$, is given by:

$$G_{new} = (1 - p)\,G \tag{4}$$

This parameter is then requantized according to the vocoder standard. Note
that the codebook gain controls the overall level of the synthesized signal in the
speech decoder model of Figure 7, and therefore controls the overall level of the
corresponding audio signal. Attenuating the codebook gain in turn results in the
attenuation of the echo.

For the GSM FR, the block maximum parameter, $x_{max}$, is directly proportional
to the codebook gain parameter of the generic model of Figure 7. Hence the modified
block maximum parameter is computed as

$$x_{max,new} = (1 - p)\,x_{max}. \tag{5}$$

$x_{max,new}$ is then requantized according to the method prescribed in the standard.
The resulting 6 bit value is reinserted at the appropriate positions in the bit-stream.
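
As a minimal sketch of the gain attenuation described above, the following Python function scales a decoded gain by one minus the echo likelihood and then requantizes it. The function name and the `quantize` argument are hypothetical; the default `round` is only a placeholder and does not reproduce the quantizer prescribed by any vocoder standard.

```python
def attenuate_gain(gain, p, quantize=round):
    """Scale a decoded gain by (1 - p), then pass it to a quantizer.

    gain     -- decoded codebook gain (or block maximum for GSM FR)
    p        -- echo likelihood in [0, 1]
    quantize -- placeholder for the standard's requantization step
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("echo likelihood must lie in [0, 1]")
    return quantize((1.0 - p) * gain)
```

For example, `attenuate_gain(1000, 0.25)` yields 750, while an echo likelihood of 1 drives the gain to zero and a likelihood of 0 leaves it unchanged.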

Codebook Vector Modification 

The codebook vector, $c(n)$, is modified by randomizing the pulse positions
and amplitudes. Randomizing the codebook vector destroys the
correlation properties of the echo, and with them much of the
"speech-like" nature of the echo. The randomization is performed whenever the
likelihood of echo is determined to be high, preferably when $p > 0.8$. The
randomization may be performed using any suitable pseudo-random bit generation
technique.

In the case of the GSM FR, the codebook vector for each subframe is
determined by the RPE grid position parameter (2 bits) and 13 RPE pulses (3 bits
each). These 41 bits are replaced with 41 random bits using a pseudo-random bit
generator.
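
The replacement of the 41 codebook bits may be sketched as follows. The function name, the argument layout (one grid position value and a list of 13 pulse values) and the use of Python's `random` module as the pseudo-random bit generator are assumptions made for illustration.

```python
import random


def randomize_rpe_bits(grid_position, pulses, p, threshold=0.8, rng=random):
    """Return (grid_position, pulses), randomized when p exceeds threshold.

    grid_position -- 2-bit RPE grid position (0..3)
    pulses        -- list of 13 RPE pulse codes, 3 bits each (0..7)
    p             -- echo likelihood for the subframe
    rng           -- any object offering getrandbits(), e.g. random.Random
    """
    if len(pulses) != 13:
        raise ValueError("expected 13 RPE pulses per subframe")
    if p <= threshold:
        # Echo likelihood is not high: leave the codebook vector untouched.
        return grid_position, list(pulses)
    # 2 + 13 * 3 = 41 random bits in total.
    new_grid = rng.getrandbits(2)
    new_pulses = [rng.getrandbits(3) for _ in range(13)]
    return new_grid, new_pulses
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which is convenient when comparing processed bit-streams during testing.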

Pitch Synthesis Filter Modification 

The pitch synthesis filter implements any period of long-term correlation in
the speech signal, and is particularly important for modeling the harmonics of voiced
speech. The model of this filter discussed in Figure 7 uses only two parameters, the
pitch period, $T$, and the pitch gain, $g_p$. During voiced speech, the pitch period is
relatively constant over several subframes or frames. The pitch gain in most vocoders
ranges from zero to one or a small value above one (e.g. 1.2 in GSM EFR). During
strong voiced speech, the pitch gain is at or near its maximum value.

If only echo is present in the near-end signal, the voiced harmonics of the echo
are generally well modeled by the pitch synthesis filter, and the likelihood of echo is
detected to be high ($p > 0.8$).

If both echo and near-end speech are present in the near-end signal during a
frame period, the likelihood of echo is at moderate levels ($0.5 \le p \le 0.8$). In such
situations, the encoding process generally results in modeling the stronger of the two
signals. It is reasonable to assume that, in most cases, the near-end speech is stronger
than the echo. If this is the case, then the encoding process, due to its nature, tends to
model mostly the near-end speech harmonics and little or none of the echo harmonics
with the pitch synthesis filter.

In order to remove or mask voiced echo, the harmonic nature of the echo is
destroyed. This is achieved by modifying the pitch synthesis filter parameters as
follows:

The pitch period is randomized so that long-term correlation in the echo is
removed, hence destroying the voiced nature of the echo. Such randomization is
performed only when the likelihood of echo is high, preferably when $p > 0.8$.



The pitch gain is reduced so as to control the strength of the harmonics or the
strength of the long-term correlation in the audio signal. Such gain attenuation is
preferably performed only when the likelihood of echo is at least moderate ($p > 0.5$).
The new pitch gain is obtained as

$$g_{p,new} = \begin{cases} (1 - p)\,g_p, & \text{if } p > 0.5 \\ g_p, & \text{otherwise} \end{cases}$$

Note that with this approach, the pitch period is not randomized during
moderate echo likelihood but the pitch gain may be attenuated so that the voicing
quality of the signal is not as strong.

Figure 14 shows the magnitude frequency responses of a pitch synthesis filter
with pitch period $T = 41$. The dotted line is the response for a high pitch gain
($g_p = 0.75$) and the solid line illustrates what happens when the pitch gain is
attenuated to $g_p = 0.3$. The strength of the harmonics and long-term correlation of an
audio signal can be controlled by modifying this parameter in this manner.
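
The effect of the pitch gain on harmonic strength can be checked numerically. The sketch below assumes the common one-tap form of the pitch synthesis filter, $1/(1 - g_p z^{-T})$; this form and the function names are assumptions of the sketch, since the text specifies only that the filter has the two parameters $T$ and $g_p$.

```python
import cmath
import math


def pitch_filter_magnitude(gp, T, omega):
    """|H(e^{j omega})| for the assumed one-tap filter 1 / (1 - gp * z^-T)."""
    return 1.0 / abs(1.0 - gp * cmath.exp(-1j * omega * T))


def peak_to_valley_db(gp, T):
    """Ratio, in dB, between the response at a harmonic and midway between two."""
    peak = pitch_filter_magnitude(gp, T, 2.0 * math.pi / T)   # at a pitch harmonic
    valley = pitch_filter_magnitude(gp, T, math.pi / T)       # between harmonics
    return 20.0 * math.log10(peak / valley)
```

Under this assumption the peak-to-valley ratio is $(1 + g_p)/(1 - g_p)$, independent of $T$: about 16.9 dB for $g_p = 0.75$ and about 5.4 dB for $g_p = 0.3$, which matches the qualitative flattening of the harmonic structure described for the attenuated gain.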

In the GSM FR vocoder, the LTP lag parameter of subframe $j$, denoted by
$N_j$, corresponds to the pitch period $T$ of the model of Figure 7. $N_j$ takes up 7 bits in
the bit-stream and can range from 40 to 120, inclusive. Hence when randomizing $N_j$,
it must be replaced with a random number that is also in this range:

$$N_{j,new} = \begin{cases} \text{a random integer in } [40, 120], & \text{if } p > 0.8 \\ N_j, & \text{otherwise} \end{cases} \tag{6}$$



The LTP gain parameter of subframe $j$ of the GSM FR vocoder, denoted by
$b_j$, corresponds to the pitch gain $g_p$ of Figure 7. The modified LTP gain parameter
is obtained in a manner similar to equation (6) as

$$b_{j,new} = \begin{cases} (1 - p)\,b_j, & \text{if } p > 0.5 \\ b_j, & \text{otherwise} \end{cases} \tag{7}$$
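
A possible realization of the pitch parameter modification for one subframe is sketched below. The thresholds 0.8 and 0.5 and the lag range 40 to 120 come from the text; the scaling of the gain by $(1 - p)$ is an assumption made by analogy with the codebook gain of equation (4), and the function names and the nearest-level quantizer are hypothetical.

```python
import random

LAG_MIN, LAG_MAX = 40, 120  # GSM FR LTP lag range stated in the text


def modify_ltp(lag, gain, p, rng=random):
    """Return (new_lag, new_gain) for one subframe given echo likelihood p."""
    # High echo likelihood: destroy long-term correlation by randomizing the lag.
    new_lag = rng.randint(LAG_MIN, LAG_MAX) if p > 0.8 else lag
    # At least moderate likelihood: weaken the harmonics (assumed (1 - p) scaling).
    new_gain = (1.0 - p) * gain if p > 0.5 else gain
    return new_lag, new_gain


def nearest_level(value, levels):
    """Placeholder quantizer: choose the closest of the allowed gain levels."""
    return min(levels, key=lambda q: abs(q - value))
```

The attenuated gain must still be mapped back onto the values the vocoder can represent; `nearest_level` stands in for that step, and the set of levels passed to it should be taken from the applicable standard rather than from this sketch.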

LPC Synthesis Filter Modification 

In the generic speech decoder model of Figure 7, the LPC synthesis filter
transfer function is $1/\left(1 - \sum_{k} a_k z^{-k}\right)$. This filter provides the broad spectral shaping
for the synthesized signal. The magnitude frequency response of this filter may be
flattened by replacing the coefficients $\{a_k\}$ with $\{a_k \beta^k\}$, where $\beta$, with $0 \le \beta \le 1$, is termed
the spectral morphing factor. In other words, the modified transfer function is
$1/\left(1 - \sum_{k} a_k \beta^k z^{-k}\right)$. Note that when $\beta = 0$, the original LPC synthesis filter is
transformed into an all-pass filter, and when $\beta = 1$, the original filter remains
unchanged. For all values of $\beta$ between 0 and 1, the original filter magnitude
frequency response experiences some flattening, with greater flattening as $\beta \to 0$.
Note that filter stability is maintained in this transformation.

The effect of such spectral morphing on echo is to reduce or remove any
formant structure present in the signal. The echo is blended or morphed to sound like
background noise. As an example, the LPC synthesis filter magnitude frequency
response for a voiced speech segment and its flattened versions for several different
values of $\beta$ are shown in Figure 15.
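
The coefficient scaling described above may be illustrated with the following sketch. The function names are hypothetical, and the coefficient list is assumed to hold $a_1$ in its first position.

```python
import cmath
import math


def morph_lpc(coeffs, beta):
    """Replace each a_k with a_k * beta**k; coeffs[0] is taken to be a_1."""
    return [a * beta ** k for k, a in enumerate(coeffs, start=1)]


def lpc_magnitude(coeffs, omega):
    """Magnitude of 1 / (1 - sum_k a_k e^{-j omega k}) at normalized frequency omega."""
    denom = 1.0 - sum(a * cmath.exp(-1j * omega * k)
                      for k, a in enumerate(coeffs, start=1))
    return 1.0 / abs(denom)
```

Scaling $a_k$ by $\beta^k$ is equivalent to evaluating the original filter at $z/\beta$, so every pole is pulled toward the origin by the factor $\beta$. This is why the response flattens as $\beta$ decreases and why a stable filter remains stable.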

In the preferred implementation, the spectral morphing factor $\beta$ is determined
as

$$\beta = \begin{cases} \ldots \\ \ldots, & \text{otherwise} \end{cases} \tag{8}$$



A similar spectral morphing method is obtained for other representations of
the LPC filter coefficients commonly used in vocoders such as reflection coefficients,
log-area ratios, inverse sine functions, and line spectral frequencies.

For example, the GSM FR vocoder utilizes log-area ratios for representing the
LPC synthesis filter. Given the 8 log-area ratios corresponding to a frame, denoted by
$LAR(i)$, $i = 1, 2, \ldots, 8$, the spectrally morphed log-area ratios are obtained using

$$LAR_{new}(i) = \beta\,LAR(i) \tag{9}$$

where $\beta$ is determined according to equation (8). This method spectrally
flattens the LPC synthesis filter magnitude frequency response. Alternatively, in order
to morph the log-area ratios towards a predetermined spectrum or magnitude
frequency response, such as the background noise spectrum represented by a set of
log-area ratios denoted by $LAR_{noise}(i)$, the appropriate morphing equation is

$$LAR_{new}(i) = \beta\,LAR(i) + (1 - \beta)\,LAR_{noise}(i) \tag{10}$$

The modified log-area ratios are then quantized according to the specifications
in the standard. Note that these approaches to modification of the log-area ratios
preserve the stability of the LPC synthesis filter.
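
The two log-area ratio morphing rules can be written as a single function, sketched below. The function name and the convention that an omitted noise set means morphing towards a flat spectrum (all log-area ratios zero) are choices made for this illustration.

```python
def morph_lars(lars, beta, noise_lars=None):
    """Morph log-area ratios by beta.

    With noise_lars omitted this scales each LAR by beta (flattening);
    otherwise it interpolates between the input and the noise LARs.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if noise_lars is None:
        noise_lars = [0.0] * len(lars)  # flat spectrum target
    if len(noise_lars) != len(lars):
        raise ValueError("LAR sets must have the same length")
    return [beta * x + (1.0 - beta) * n for x, n in zip(lars, noise_lars)]
```

Stability is preserved because any finite log-area ratio corresponds to a reflection coefficient of magnitude less than one, and a weighted average of finite values is itself finite.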

An exemplary approach for background noise spectrum estimation and
representation of filter coefficients comprising log-area ratios corresponding to the
vocoder and an LPC filter is provided in the comfort noise generation standard [5] and
the references therein.

When line spectral frequencies are used for representing the LPC synthesis
filter (e.g. the GSM EFR), an approach similar to that for log-area ratios is also
appropriate. Denote the line spectral frequencies by $f_i$, $i = 1, \ldots, M$, where $M$ is the
order of the LPC synthesis filter, which is assumed even (typical). When the line
spectral frequencies are evenly spaced apart from 0 to half the sampling frequency,
the resulting LPC synthesis filter will be all-pass (i.e. flat magnitude frequency
response). Denote the set of line spectral frequencies corresponding to such a
spectrally flat LPC filter by $f_{i,flat}$, $i = 1, \ldots, M$. Then, the spectrally morphed line
spectral frequencies are obtained using

$$f_{i,new} = \beta f_i + (1 - \beta) f_{i,flat} \tag{11}$$

where $\beta$ is determined according to equation (8). This method spectrally
flattens the LPC synthesis filter magnitude frequency response. Alternatively, in
order to morph the line spectral frequencies towards a predetermined spectrum or
magnitude frequency response, such as the background noise spectrum represented by
a set of line spectral frequencies denoted by $f_{i,noise}$, the appropriate morphing equation
is

$$f_{i,new} = \beta f_i + (1 - \beta) f_{i,noise} \tag{12}$$
The modified line spectral frequencies are then quantized according to the
specifications in the standard. Note that these approaches to modification of the line
spectral frequencies preserve the stability of the LPC synthesis filter. Appropriate
techniques for background noise spectrum estimation and representation of filter
coefficients comprising line spectral frequencies may be found in the corresponding
vocoder standards on comfort noise generation.
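
Morphing of line spectral frequencies towards a target set may be sketched as follows. The function names are hypothetical, and the particular even spacing used for the flat target (multiples of half the sampling frequency divided by $M + 1$) is an assumption of this sketch.

```python
def flat_lsfs(order, sample_rate):
    """Evenly spaced LSFs strictly between 0 and sample_rate / 2 (assumed spacing)."""
    step = (sample_rate / 2.0) / (order + 1)
    return [step * i for i in range(1, order + 1)]


def morph_lsfs(lsfs, beta, target):
    """Interpolate between the input LSFs and a target set (flat or noise)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if len(lsfs) != len(target):
        raise ValueError("LSF sets must have the same length")
    return [beta * f + (1.0 - beta) * t for f, t in zip(lsfs, target)]
```

When both the input and the target sets are strictly increasing, every weighted average of the two is strictly increasing as well, so the ordering property on which stability of the LPC synthesis filter depends is retained.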

Minimal Delay Technique 

Large buffering, processing and transmission delays are already present in
cellular networks without any network voice quality enhancement processing. Further
network processing of the coded speech for speech enhancement purposes adds
further delay. Minimizing this delay is important to speech quality. In this section,
a novel approach for minimizing the delay is discussed. The example used is the GSM
FR vocoder.

Figure 8 shows the order in which the coded parameters from the GSM FR
encoder are received. A straightforward approach involves buffering up the entire 260
bits for each frame and then processing these buffered bits for coded domain echo
control purposes. However, this introduces a buffering delay of about 20ms plus the
processing delay.

It is possible to minimize the buffering delay as follows. First, note that the
entire first subframe can be decoded as soon as bit 92 is received. Hence the first
subframe may be processed after about 7.1ms (20ms times 92/260) of buffering delay,
which reduces the buffering delay by almost 13ms.
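
The delay figures above can be reproduced with a short calculation. The sketch assumes that the bits of a frame arrive at a constant rate, that the 8 log-area ratios occupy 36 bits at the start of the frame, and that the LTP gain occupies 2 bits per subframe; these three points are assumptions not stated in this section, while the remaining field widths are those given in the text.

```python
FRAME_MS = 20.0
FRAME_BITS = 260
LAR_BITS = 36                               # assumed size of the 8 log-area ratios
SUBFRAME_BITS = 7 + 2 + 2 + 6 + 13 * 3      # lag, gain (assumed), grid, block max, pulses


def buffering_delay_ms(bits_needed):
    """Time to receive bits_needed bits of a frame at a constant bit rate."""
    return FRAME_MS * bits_needed / FRAME_BITS


first_subframe_bits = LAR_BITS + SUBFRAME_BITS          # 92
low_delay = buffering_delay_ms(first_subframe_bits)     # roughly 7.1 ms
saving = buffering_delay_ms(FRAME_BITS) - low_delay     # roughly 12.9 ms
```

Under these assumptions the frame header plus four subframes account for all 260 bits, and the first subframe is complete at bit 92, in agreement with the figures quoted in the text.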
When using this novel low delay approach, the coded LPC synthesis filter
parameters are modified based on information available at the end of the first
subframe of the frame. In other words, the entire frame is affected by the echo
likelihood computed based on the first subframe. In experiments conducted, no
noticeable artifacts were found due to this 'early' decision, particularly because the
echo likelihood is a smoothed quantity based effectively on several previous
subframes as well as the current subframe.

Update of Error Correction/Detection Bits and Framing Bits 

When applying the novel coded domain processing techniques described in this
specification for removing echo, some or all of the bits corresponding to the coded
parameters are modified in the bit-stream. This may affect other error-correction or
detection bits that may also be embedded in the bit-stream. For instance, a speech
encoder may embed some checksums in the bit-stream so that the decoder can verify
that an error-free frame is received. Such checksums as well as any parity
check bits, error correction or detection bits, and framing bits are updated in
accordance with the appropriate standard, if necessary.

Operation under the GSM Tandem Free Operation Standard 

If only the coded parameters are available, then partial or full decoding may be
performed as explained earlier, whereby the coded parameters are used to reconstruct
a version of the audio signal. However, when operating under a situation such as the
GSM TFO environment, additional information is available in addition to the coded
parameters. This additional information is the 6 MSBs of the A-law PCM samples of
the audio signal. In this case, these PCM samples may be used to reconstruct a version
of the audio signal for both the far end and near end without using the coded
parameters. This results in computational savings.
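
A minimal sketch of how the two parts of a TFO sample might be separated is given below. It assumes that each 8-bit sample carries the PCM information in its 6 most significant bits and the coded speech in its 2 least significant bits; the function names are hypothetical, and A-law expansion to linear amplitude is deliberately omitted.

```python
def split_tfo_sample(sample):
    """Return (pcm_msbs, coded_lsbs) for one 8-bit sample."""
    if not 0 <= sample <= 0xFF:
        raise ValueError("sample must be an 8-bit value")
    return sample >> 2, sample & 0x03


def coarse_pcm_codeword(sample):
    """A-law codeword with the two coded-speech LSBs cleared."""
    if not 0 <= sample <= 0xFF:
        raise ValueError("sample must be an 8-bit value")
    return sample & 0xFC
```

The coarse codeword can be passed to an ordinary A-law expander to obtain an approximate waveform for power measurement, which avoids running the speech decoder on either the near-end or the far-end signal.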

Those skilled in the art of communications will recognize that the preferred
embodiments can be modified and altered without departing from the true spirit and
scope of the invention as defined in the appended claims.



WO 01/03316

What is claimed is 

1. In a communications system for transmitting a near end digital signal using
a compression code comprising a plurality of parameters including a first parameter, said
parameters representing an audio signal comprising a plurality of audio characteristics,
said compression code being decodable by a plurality of decoding steps, said
communications system also transmitting a far end digital signal using a compression
code, apparatus for reducing echo in said near end digital signal comprising:

a processor responsive to said near end digital signal to read at least said
first parameter of said plurality of parameters, to perform at least one of said plurality of
decoding steps on said near end digital signal and said far end digital signal to generate
at least partially decoded near end signals and at least partially decoded far end signals,
responsive to said at least partially decoded near end signals and at least partially
decoded far end signals to adjust said first parameter to generate an adjusted first
parameter and to replace said first parameter with said adjusted first parameter in said
near end digital signal.

2. Apparatus, as claimed in claim 1, wherein said first parameter is a
quantized first parameter and wherein said processor generates said adjusted first
parameter in part by quantizing said adjusted first parameter before writing said adjusted
first parameter into said near end digital signal.

3. Apparatus, as claimed in claim 1, wherein said processor is responsive to
said at least partially decoded near end signals and said at least partially decoded far end
signals to generate an echo likelihood signal representing the amount of echo present in
said partially decoded near end signals, and wherein said processor is responsive to said
echo likelihood signal to adjust said first parameter.

4. Apparatus, as claimed in claim 3, wherein said characteristics comprise
spectral shape and wherein said first parameter comprises a representation of filter
coefficients, and wherein said processor is responsive to said echo likelihood signal to
adjust said representation of filter coefficients towards a magnitude frequency response.

5. Apparatus, as claimed in claim 4, wherein said representation of filter
coefficients comprises line spectral frequencies.

6. Apparatus, as claimed in claim 4, wherein said representation of filter
coefficients comprises log area ratios.

7. Apparatus, as claimed in claim 4, wherein said magnitude frequency
response corresponds to background noise.

8. Apparatus, as claimed in claim 1, wherein said characteristics comprise the
overall level of said audio signal and wherein said first parameter comprises codebook
gain.

9. Apparatus, as claimed in claim 1, wherein said first parameter comprises a
codebook vector parameter.

10. Apparatus, as claimed in claim 1, wherein said characteristics comprise
period of long-term correlation and wherein said first parameter comprises a pitch period
parameter.

11. Apparatus, as claimed in claim 1, wherein said characteristics comprise
strength of long-term correlation and wherein said first parameter comprises a pitch gain
parameter.

12. Apparatus, as claimed in claim 1, wherein said characteristics comprise
spectral shape and wherein said first parameter comprises a representation of filter
coefficients.

13. Apparatus, as claimed in claim 12, wherein said representation of filter
coefficients comprises log area ratios.

14. Apparatus, as claimed in claim 12, wherein said representation of filter
coefficients comprises line spectral frequencies.

15. Apparatus, as claimed in claim 12, wherein said representation of filter
coefficients corresponds to a linear predictive coding synthesis filter.

16. Apparatus, as claimed in claim 1, wherein said first parameter corresponds
to a first characteristic of said plurality of audio characteristics, wherein said plurality of
decoding steps comprises at least one decoding step avoiding substantial altering of said
first characteristic and wherein said processor avoids performing said at least one
decoding step.

17. Apparatus, as claimed in claim 16, wherein said audio characteristic
comprises power and wherein said first characteristic comprises power.

18. Apparatus, as claimed in claim 16, wherein said at least one decoding step
comprises post-filtering.

19. Apparatus, as claimed in claim 1, wherein said compression code
comprises a linear predictive code.

20. Apparatus, as claimed in claim 1, wherein said compression code comprises
regular pulse excitation - long term prediction code.

21. Apparatus, as claimed in claim 1, wherein said compression code comprises
code-excited linear prediction code.

22. Apparatus, as claimed in claim 1, wherein said first parameter comprises a
series of first parameters received over time, wherein said processor is responsive to said
near end digital signal to read said series of first parameters, and wherein said processor
is responsive to said at least partially decoded near end and far end signals and to at least
a plurality of said series of first parameters to generate said adjusted first parameter.

23. Apparatus, as claimed in claim 1, wherein said compression code is
arranged in frames of said digital signals and wherein said frames comprise a plurality of
subframes each comprising said first parameter, wherein said processor is responsive to
said compression code to read at least said first parameter from each of said plurality of
subframes, and wherein said processor replaces said first parameter with said adjusted
first parameter in each of said plurality of subframes.

24. Apparatus, as claimed in claim 23, wherein said processor reads said first
parameter from a first of said subframes, begins to perform at least a plurality of said
decoding steps on said near end digital signal during said first subframe and replaces
said first parameter with said adjusted first parameter before processing a subframe
following the first subframe so as to achieve lower delay.

25. Apparatus, as claimed in claim 1, wherein said compression code is
arranged in frames of said digital signals and wherein said frames comprise a plurality of
subframes each comprising said first parameter, wherein said processor performs at least
a plurality of said decoding steps during a first of said subframes to generate said at least
partially decoded near end and far end signals, reads said first parameter from a second
of said subframes occurring subsequent to said first subframe, generates said adjusted
first parameter in response to said at least partially decoded near end and far end signals
and said first parameter, and replaces said first parameter of said second subframe with
said adjusted first parameter.

26. In a communications system for transmitting a near end digital signal
comprising code samples, said code samples comprising first bits using a compression
code and second bits using a linear code, said code samples representing an audio signal,
said audio signal having a plurality of audio characteristics, said system also transmitting
a far end digital signal, apparatus for reducing echo in said near end digital signal
without decoding said compression code comprising:

27. a processor responsive to said near end digital signal and said far end
digital signal to adjust said first bits and said second bits.

28. Apparatus, as claimed in claim 26, wherein said linear code comprises
pulse code modulation (PCM) code.

29. Apparatus, as claimed in claim 26, wherein said compression code samples
conform to the tandem-free operation of the global system for mobile communications
standard.

30. Apparatus, as claimed in claim 26, wherein said first bits comprise the two
least significant bits of said samples and wherein said second bits comprise the 6 most
significant bits of said samples.

31. Apparatus, as claimed in claim 29, wherein said 6 most significant bits
comprise PCM code.

32. In a communications system for transmitting a near end digital signal using
a compression code comprising a plurality of parameters including a first parameter, said
parameters representing an audio signal comprising a plurality of audio characteristics,
said compression code being decodable by a plurality of decoding steps, said
communications system also transmitting a far end digital signal using a compression
code, a method of reducing echo in said near end digital signal comprising:

reading at least said first parameter of said plurality of parameters in
response to said near end digital signal;

performing at least one of said plurality of decoding steps on said near end
digital signal and said far end digital signal to generate at least partially decoded near end
signals and at least partially decoded far end signals;

adjusting said first parameter in response to said at least partially decoded
near end signals and at least partially decoded far end signals to generate an adjusted first
parameter; and

replacing said first parameter with said adjusted first parameter in said near
end digital signal.

33. A method, as claimed in claim 31, wherein said first parameter is a
quantized first parameter and wherein said adjusting comprises generating said adjusted
first parameter in part by quantizing said adjusted first parameter.

34. A method, as claimed in claim 31, wherein said adjusting comprises
generating an echo likelihood signal representing the amount of echo present in said
partially decoded near end signals in response to said at least partially decoded near end
signals and said at least partially decoded far end signals, and wherein said adjusting
further comprises adjusting said first parameter in response to said echo likelihood
signal.

35. A method, as claimed in claim 33, wherein said characteristics comprise
spectral shape and wherein said first parameter comprises a representation of filter
coefficients, and wherein said adjusting comprises adjusting said representation of filter
coefficients towards a magnitude frequency response in response to said echo likelihood
signal.

36. A method, as claimed in claim 34, wherein said representation of filter
coefficients comprises line spectral frequencies.

37. A method, as claimed in claim 34, wherein said representation of filter
coefficients comprises log area ratios.

38. A method, as claimed in claim 34, wherein said magnitude frequency
response corresponds to background noise.

39. A method, as claimed in claim 31, wherein said characteristics comprise
the overall level of said audio signal and wherein said first parameter comprises
codebook gain.

40. A method, as claimed in claim 31, wherein said first parameter comprises a
codebook vector parameter.

41. A method, as claimed in claim 31, wherein said characteristics comprise
period of long-term correlation and wherein said first parameter comprises a pitch period
parameter.

42. A method, as claimed in claim 31, wherein said characteristics comprise
strength of long-term correlation and wherein said first parameter comprises a pitch gain
parameter.

43. A method, as claimed in claim 31, wherein said characteristics comprise
spectral shape and wherein said first parameter comprises a representation of filter
coefficients.

44. A method, as claimed in claim 42, wherein said representation of filter
coefficients comprises log area ratios.

45. A method, as claimed in claim 42, wherein said representation of filter
coefficients comprises line spectral frequencies.

46. A method, as claimed in claim 42, wherein said representation of filter
coefficients corresponds to a linear predictive coding synthesis filter.

47. A method, as claimed in claim 31, wherein said first parameter corresponds
to a first characteristic of said plurality of audio characteristics, wherein said plurality of
decoding steps comprises at least one decoding step avoiding substantial altering of said
first characteristic and wherein said performing at least a plurality of said decoding steps
comprises avoiding performing said at least one decoding step.

48. A method, as claimed in claim 46, wherein said audio characteristic
comprises power and wherein said first characteristic comprises power.

49. A method, as claimed in claim 46, wherein said at least one decoding step
comprises post-filtering.

50. A method, as claimed in claim 31, wherein said compression code
comprises a linear predictive code.

51. A method, as claimed in claim 31, wherein said compression code
comprises regular pulse excitation - long term prediction code.

52. A method, as claimed in claim 31, wherein said compression code
comprises code-excited linear prediction code.

53. A method, as claimed in claim 31, wherein said first parameter comprises a
series of first parameters received over time, wherein said reading comprises reading
said series of first parameters, and wherein said adjusting comprises generating said
adjusted first parameter in response to said at least partially decoded near end and far end
signals and to at least a plurality of said series of first parameters.

54. A method, as claimed in claim 31, wherein said compression code is
arranged in frames of said digital signals and wherein said frames comprise a plurality of
subframes each comprising said first parameter, wherein said reading comprises reading
at least said first parameter from each of said plurality of subframes in response to said
compression code, and wherein said replacing comprises replacing said first parameter
with said adjusted first parameter in each of said plurality of subframes.

55. A method, as claimed in claim 53, wherein said reading comprises reading
said first parameter from a first of said subframes, wherein said performing comprises
beginning to perform at least a plurality of said decoding steps on said near end digital
signal during said first subframe and wherein said replacing comprises replacing said
first parameter with said adjusted first parameter before processing a subframe following
the first subframe so as to achieve lower delay.

56. A method, as claimed in claim 31, wherein said compression code is
arranged in frames of said digital signals and wherein said frames comprise a plurality of
subframes each comprising said first parameter, wherein said performing comprises
performing at least a plurality of said decoding steps during a first of said subframes to
generate said at least partially decoded near end and far end signals, wherein said reading
comprises reading said first parameter from a second of said subframes occurring
subsequent to said first subframe, wherein said adjusting comprises generating said
adjusted first parameter in response to said at least partially decoded near end and far end
signals and said first parameter, and wherein said replacing comprises replacing said first
parameter of said second subframe with said adjusted first parameter.

57. In a communications system for transmitting a near end digital signal
comprising code samples, said code samples comprising first bits using a compression
code and second bits using a linear code, said code samples representing an audio signal,
said audio signal having a plurality of audio characteristics, said system also transmitting
a far end digital signal, a method of reducing echo in said near end digital signal without
decoding said compression code comprising:

adjusting said first bits and said second bits in response to said near end
digital signal and said far end digital signal.

58. A method, as claimed in claim 56, wherein said linear code comprises
pulse code modulation (PCM) code.

59. A method, as claimed in claim 56, wherein said compression code samples
conform to the tandem-free operation of the global system for mobile communications
standard.

60. A method, as claimed in claim 56, wherein said first bits comprise the two
least significant bits of said samples and wherein said second bits comprise the 6 most
significant bits of said samples.

61. A method, as claimed in claim 59, wherein said 6 most significant bits
comprise PCM code.
(57) Abstract: A communications system (10) transmits a near end digital signal using a compression code comprising a plurality of
parameters including a first parameter. The parameters represent an audio signal comprising a plurality of audio characteristics. The
compression code is decodable by a plurality of decoding steps. The system also transmits a far end digital signal using a compression
code. A terminal (20) receives the near end digital signal, and a terminal (36) receives the far end digital signal. A processor (40) is
responsive to the near end digital signal to read at least the first parameter. The processor generates at least partially decoded near
end signals and at least partially decoded far end signals. Based on such signals, the processor adjusts the first parameter and writes
the adjusted first parameter into the near end digital signal. Another terminal (22) transmits the adjusted near end digital signal. As
a result, the echo in the near end digital signal is reduced.
Joseph M. Butscher Reg. No. 48,326

Stephen M. Miller Reg. No. 40,728

Troy A. Groelken Reg. No. 46,442

Michael J. Fitzpatrick Reg. No. 48,510

John A. Wiborg Reg. No. 44,401

David Muzilla Reg. No. P-50,914



Address all telephone calls to Lawrence M. Jarvis at telephone number: 

(312) 775-8197. 

Address all correspondence to: 

McAndrews, Held & Malloy, Ltd. 
34th Floor
500 W. Madison Street 
Chicago, Illinois 60661 

I hereby declare that all statements made herein of my own knowledge are true and that
all statements made on information and belief are believed to be true; and further that
these statements were made with the knowledge that willful false statements and the
like so made are punishable by fine or imprisonment, or both, under Section 1001 of
Title 18 of the United States Code and that such willful false statements may jeopardize
the validity of the application or any patent issued thereon.

This declaration names 2 inventors below.



Information about sole or first inventor: 

(given name, family name): Ravi Chandran 

Residence: 18082 East Courtland Drive
South Bend, IN 46637

Citizenship: U.S.A. 
Post Office Address: Same 



First inventor's signature: 
Date Signed: 
(given name, family name): Daniel J. Marchok

Residence: 14984 West Clear Lake Road
Buchanan, MI 49107
Citizenship: U.S.A. 
Post Office Address: Same 



Second inventor's signature: 



Date Signed: 3j/i/o^ 